CLEAN rewards for improving multiagent coordination in the presence of exploration
Authors
Abstract
In cooperative multiagent systems, coordinating the joint actions of agents is difficult. A fundamental difficulty is the slow learning process: an agent may need not only to learn how to behave in a complex environment, but also to account for the actions of the other learning agents. The inability of agents to distinguish the true environmental dynamics from those caused by the stochastic exploratory actions of other agents creates noise on each agent's reward signal. To address this, we introduce Coordinated Learning without Exploratory Action Noise (CLEAN) rewards, agent-specific shaped rewards that effectively remove such exploration noise from each agent's reward signal. We demonstrate their performance with up to 1000 agents in a standard congestion problem.
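The idea of removing exploratory action noise via private counterfactuals can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the bar-style congestion utility, the capacity parameter, and the helper names (`system_utility`, `clean_reward`, `train`) are assumptions chosen for the sketch. Every agent acts greedily in public, so no agent's reward is corrupted by others' exploration; each agent then privately evaluates a counterfactual exploratory action offline and updates from that signal.

```python
import random
import math

def system_utility(counts, capacity=5.0):
    """Congestion-style system utility (assumed form): each resource
    contributes attendance * exp(-attendance / capacity)."""
    return sum(x * math.exp(-x / capacity) for x in counts)

def counts_for(actions, n_actions):
    """Tally how many agents chose each of the n_actions resources."""
    counts = [0] * n_actions
    for a in actions:
        counts[a] += 1
    return counts

def clean_reward(actions, agent, counterfactual, n_actions):
    """CLEAN-style reward: utility of the joint action with this agent's
    action privately swapped for a counterfactual exploratory action,
    minus the utility of the joint action actually taken."""
    actual = counts_for(actions, n_actions)
    cf = list(actual)
    cf[actions[agent]] -= 1
    cf[counterfactual] += 1
    return system_utility(cf) - system_utility(actual)

def train(n_agents=50, n_actions=5, episodes=200, alpha=0.1, seed=0):
    """Stateless Q-learners: public behavior is greedy, exploration
    happens only in the private counterfactual update."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_agents)]
    for _ in range(episodes):
        joint = [max(range(n_actions), key=lambda a: qi[a]) for qi in q]
        for i in range(n_agents):
            c = rng.randrange(n_actions)  # private exploratory action
            q[i][c] += alpha * (clean_reward(joint, i, c, n_actions) - q[i][c])
    return q
```

Because the counterfactual swap never reaches the environment, other agents observe only greedy joint actions, which is the mechanism by which the exploration noise is removed from their reward signals.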
Similar resources
Counterfactual Exploration for Improving Multiagent Learning
In any single agent system, exploration is a critical component of learning. It ensures that all possible actions receive some degree of attention, allowing an agent to converge to good policies. The same concept has been adopted by multiagent learning systems. However, there is a fundamentally different dynamic in multiagent learning: each agent operates in a non-stationary environment, as a d...
Feature Selection as a Multiagent Coordination Problem
Datasets with hundreds to tens of thousands of features are the new norm. Feature selection constitutes a central problem in machine learning, where the aim is to derive a representative set of features from which to construct a classification (or prediction) model for a specific task. Our experimental study involves microarray gene expression datasets; these are high-dimensional and noisy datasets...
Combining reward shaping and hierarchies for scaling to large multiagent systems
Coordinating the actions of agents in multiagent systems presents a challenging problem, especially as the size of the system is increased and predicting the agent interactions becomes difficult. Many approaches to improving coordination within multiagent systems have been developed including organizational structures, shaped rewards, coordination graphs, heuristic methods, and learning automat...
CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning
Coordinating the joint-actions of agents in cooperative multiagent systems is a difficult problem in many real world domains. Learning in such multiagent systems can be slow because an agent may not only need to learn how to behave in a complex environment, but also to account for the actions of other learning agents. The inability of an agent to distinguish between the true environmental dynam...
Exploiting structure and utilizing agent-centric rewards to promote coordination in large multiagent systems
A goal within the field of multiagent systems is to achieve scaling to large systems involving hundreds or thousands of agents. In such systems the communication requirements for agents as well as the individual agents’ ability to make decisions both play critical roles in performance. We take an incremental step towards improving scalability in such systems by introducing a novel algorithm tha...
Journal title:
Volume, Issue:
Pages: -
Publication date: 2013